bias score
- North America > Canada > Alberta (0.14)
- Europe > United Kingdom > England (0.04)
- Asia > China (0.04)
- (3 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.93)
- Law (0.93)
- Banking & Finance > Economy (0.47)
Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models
Park, NaHyeon, An, Namin, Kim, Kunhee, Yoon, Soyeon, Huo, Jiahao, Shim, Hyunjung
Large vision-language model (LVLM) based text-to-image (T2I) systems have become the dominant paradigm in image generation, yet whether they amplify social biases remains insufficiently understood. In this paper, we show that LVLM-based models produce markedly more socially biased images than non-LVLM-based models. We introduce a 1,024 prompt benchmark spanning four levels of linguistic complexity and evaluate demographic bias across multiple attributes in a systematic manner. Our analysis identifies system prompts, the predefined instructions guiding LVLMs, as a primary driver of biased behavior. Through decoded intermediate representations, token-probability diagnostics, and embedding-association analyses, we reveal how system prompts encode demographic priors that propagate into image synthesis. To this end, we propose FairPro, a training-free meta-prompting framework that enables LVLMs to self-audit and construct fairness-aware system prompts at test time. Experiments on two LVLM-based T2I models, SANA and Qwen-Image, show that FairPro substantially reduces demographic bias while preserving text-image alignment. We believe our findings provide deeper insight into the central role of system prompts in bias propagation and offer a practical, deployable approach for building more socially responsible T2I systems.
- Health & Medicine (1.00)
- Transportation > Air (0.68)
- Transportation > Infrastructure & Services (0.46)
Constructing Political Coordinates: Aggregating Over the Opposition for Diverse News Recommendation
Earl, Eamon, Ding, Chen, Valenzano, Richard, Paulen-Patterson, Drai
Abstract--In the past two decades, open access to news and information has increased rapidly, empowering educated political growth within democratic societies. News recommender systems (NRSs) have shown to be useful in this process, minimizing political disengagement and information overload by providing individuals with articles on topics that matter to them. Unfortunately, NRSs often conflate underlying user interest with the partisan bias of the articles in their reading history and with the most popular biases present in the coverage of their favored topics. Over extended interaction, this can result in the formation of filter bubbles and the polarization of user partisanship. In this paper, we propose a novel embedding space called Constructed Political Coordinates (CPC), which models the political partisanship of users over a given topic-space, relative to a larger sample population. We apply a simple collaborative filtering (CF) framework using CPC-based correlation to recommend articles sourced from oppositional users, who have different biases from the user in question. We compare against classical CF methods and find that CPC-based methods promote pointed bias diversity and better match the true political tolerance of users, while classical methods implicitly exploit biases to maximize interaction. Recommender system (RS) utility has two main value measurements: users seeing content that they engage positively with, and the content providers maximizing engagement with their content or platform. While the two are evidently correlated (i.e. a user who is not properly catered to will likely cease to use the platform), the latter provides motivation for recommendation algorithms to shift a user's preferences to make them easier to cater to, resulting in higher expectations of long-term engagement [1]. Previous research [2] on the relationship between recom-mender systems and American political typology suggests that users with more extreme political preferences exhibit higher engagement metrics with their recommended news. Additionally, it was found that their engagement can be maximized by recommending articles among which a dominant percentage express a singular partisan bias. This establishes an implicit incentive for a News Recommender System (NRS) to shift user preferences toward political extremes through selection bias, particularly in long-term value systems or those leveraging popularity [1]. This phenomenon results in the formation of filter bubbles, where users are eventually shown only perspectives in their news which comply with their preexisting opinions, and users with heterogeneous partisanship over distinct topics have their political ideology homogenized over time.
- North America > United States (1.00)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Middle East > Syria (0.04)
- (3 more...)
- Research Report (0.50)
- Overview (0.34)
- Government > Regional Government > North America Government > United States Government (1.00)
- Health & Medicine > Therapeutic Area (0.68)
- Media (0.68)
- Government > Military (0.67)
From n-gram to Attention: How Model Architectures Learn and Propagate Bias in Language Modeling
Kabir, Mohsinul, Tahsin, Tasfia, Ananiadou, Sophia
Current research on bias in language models (LMs) predominantly focuses on data quality, with significantly less attention paid to model architecture and temporal influences of data. Even more critically, few studies systematically investigate the origins of bias. We propose a methodology grounded in comparative behavioral theory to interpret the complex interaction between training data and model architecture in bias propagation during language modeling. Building on recent work that relates transformers to n-gram LMs, we evaluate how data, model design choices, and temporal dynamics affect bias propagation. Our findings reveal that: (1) n-gram LMs are highly sensitive to context window size in bias propagation, while transformers demonstrate architectural robustness; (2) the temporal provenance of training data significantly affects bias; and (3) different model architectures respond differentially to controlled bias injection, with certain biases (e.g. sexual orientation) being disproportionately amplified. As language models become ubiquitous, our findings highlight the need for a holistic approach -- tracing bias to its origins across both data and model dimensions, not just symptoms, to mitigate harm.
- Asia > Thailand > Bangkok > Bangkok (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Self-Adaptive Cognitive Debiasing for Large Language Models in Decision-Making
Lyu, Yougang, Ren, Shijie, Feng, Yue, Wang, Zihan, Chen, Zhumin, Ren, Zhaochun, de Rijke, Maarten
Large language models (LLMs) have shown potential in supporting decision-making applications, particularly as personal assistants in the financial, healthcare, and legal domains. While prompt engineering strategies have enhanced the capabilities of LLMs in decision-making, cognitive biases inherent to LLMs present significant challenges. Cognitive biases are systematic patterns of deviation from norms or rationality in decision-making that can lead to the production of inaccurate outputs. Existing cognitive bias mitigation strategies assume that input prompts only contain one type of cognitive bias, limiting their effectiveness in more challenging scenarios involving multiple cognitive biases. To fill this gap, we propose a cognitive debiasing approach, self-adaptive cognitive debiasing (SACD), that enhances the reliability of LLMs by iteratively refining prompts. Our method follows three sequential steps - bias determination, bias analysis, and cognitive debiasing - to iteratively mitigate potential cognitive biases in prompts. We evaluate SACD on finance, healthcare, and legal decision-making tasks using both open-weight and closed-weight LLMs. Compared to advanced prompt engineering methods and existing cognitive debiasing techniques, SACD achieves the lowest average bias scores in both single-bias and multi-bias settings.
- Europe > Austria > Vienna (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- South America > Brazil (0.04)
- (11 more...)
- Health & Medicine (1.00)
- Banking & Finance (1.00)
- Law (0.88)
A Multi-Component Reward Function with Policy Gradient for Automated Feature Selection with Dynamic Regularization and Bias Mitigation
Static feature exclusion strategies often fail to prevent bias when hidden dependencies influence the model predictions. To address this issue, we explore a reinforcement learning (RL) framework that integrates bias mitigation and automated feature selection within a single learning process. Unlike traditional heuristic-driven filter or wrapper approaches, our RL agent adaptively selects features using a reward signal that explicitly integrates predictive performance with fairness considerations. This dynamic formulation allows the model to balance generalization, accuracy, and equity throughout the training process, rather than rely exclusively on pre-processing adjustments or post hoc correction mechanisms. In this paper, we describe the construction of a multi-component reward function, the specification of the agents action space over feature subsets, and the integration of this system with ensemble learning. We aim to provide a flexible and generalizable way to select features in environments where predictors are correlated and biases can inadvertently re-emerge.
- North America > United States (0.68)
- Asia > Singapore (0.04)
- Europe > Greece (0.04)
- Law (1.00)
- Health & Medicine (1.00)
- Banking & Finance > Credit (1.00)
- Government > Regional Government > North America Government > United States Government (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Probing Social Bias in Labor Market Text Generation by ChatGPT: A Masked Language Model Approach Lei Ding
The complexity of automating bias evaluation in textual content poses significant challenges. Traditional approaches in social sciences, such as content analysis, often rely on manual word counts from static lists [Gaucher et al., 2011], which may miss the subtleties and unlisted language cues that advanced NLP technologies can detect.
- North America > Canada > Alberta (0.14)
- Europe > United Kingdom (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (0.93)
- Law (0.93)
- Banking & Finance > Economy (0.83)
IndiCASA: A Dataset and Bias Evaluation Framework in LLMs Using Contrastive Embedding Similarity in the Indian Context
S, Santhosh G, S, Akshay Govind, Krishnan, Gokul S, Ravindran, Balaraman, Natarajan, Sriraam
Large Language Models (LLMs) have gained significant traction across critical domains owing to their impressive contextual understanding and generative capabilities. However, their increasing deployment in high stakes applications necessitates rigorous evaluation of embedded biases, particularly in culturally diverse contexts like India where existing embedding-based bias assessment methods often fall short in capturing nuanced stereotypes. We propose an evaluation framework based on a encoder trained using contrastive learning that captures fine-grained bias through embedding similarity. We also introduce a novel dataset - IndiCASA (IndiBias-based Contextually Aligned Stereotypes and Anti-stereotypes) comprising 2,575 human-validated sentences spanning five demographic axes: caste, gender, religion, disability, and socioeconomic status. Our evaluation of multiple open-weight LLMs reveals that all models exhibit some degree of stereotypical bias, with disability related biases being notably persistent, and religion bias generally lower likely due to global debiasing efforts demonstrating the need for fairer model development.
- Asia > India (0.25)
- North America > United States > Texas (0.04)
- North America > United States > Ohio (0.04)
- (4 more...)
- Health & Medicine (1.00)
- Education (0.68)
Open-DeBias: Toward Mitigating Open-Set Bias in Language Models
Rani, Arti, Singh, Shweta, Sahoo, Nihar Ranjan, Nayak, Gaurav Kumar
Large Language Models (LLMs) have achieved remarkable success on question answering (QA) tasks, yet they often encode harmful biases that compromise fairness and trustworthiness. Most existing bias mitigation approaches are restricted to predefined categories, limiting their ability to address novel or context-specific emergent biases. To bridge this gap, we tackle the novel problem of open-set bias detection and mitigation in text-based QA. We introduce OpenBiasBench, a comprehensive benchmark designed to evaluate biases across a wide range of categories and subgroups, encompassing both known and previously unseen biases. Additionally, we propose Open-DeBias, a novel, data-efficient, and parameter-efficient debiasing method that leverages adapter modules to mitigate existing social and stereotypical biases while generalizing to unseen ones. Compared to the state-of-the-art BMBI method, Open-DeBias improves QA accuracy on BBQ dataset by nearly $48\%$ on ambiguous subsets and $6\%$ on disambiguated ones, using adapters fine-tuned on just a small fraction of the training data. Remarkably, the same adapters, in a zero-shot transfer to Korean BBQ, achieve $84\%$ accuracy, demonstrating robust language-agnostic generalization. Through extensive evaluation, we also validate the effectiveness of Open-DeBias across a broad range of NLP tasks, including StereoSet and CrowS-Pairs, highlighting its robustness, multilingual strength, and suitability for general-purpose, open-domain bias mitigation. The project page is available at: https://sites.google.com/view/open-debias25
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Mexico > Mexico City > Mexico City (0.04)
- North America > Dominican Republic (0.04)
- (6 more...)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.67)
- Education (0.68)
- Transportation (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Bias Mitigation or Cultural Commonsense? Evaluating LLMs with a Japanese Dataset
Yamamoto, Taisei, Kumon, Ryoma, Bollegala, Danushka, Yanaka, Hitomi
Large language models (LLMs) exhibit social biases, prompting the development of various debiasing methods. However, debiasing methods may degrade the capabilities of LLMs. Previous research has evaluated the impact of bias mitigation primarily through tasks measuring general language understanding, which are often unrelated to social biases. In contrast, cultural commonsense is closely related to social biases, as both are rooted in social norms and values. The impact of bias mitigation on cultural commonsense in LLMs has not been well investigated. Considering this gap, we propose SOBACO (SOcial BiAs and Cultural cOmmonsense benchmark), a Japanese benchmark designed to evaluate social biases and cultural commonsense in LLMs in a unified format. We evaluate several LLMs on SOBACO to examine how debiasing methods affect cultural commonsense in LLMs. Our results reveal that the debiasing methods degrade the performance of the LLMs on the cultural commonsense task (up to 75% accuracy deterioration). These results highlight the importance of developing debiasing methods that consider the trade-off with cultural commonsense to improve fairness and utility of LLMs.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (12 more...)